An algorithm based on graph theory for the assembly of contigs in physical mapping of DNA

Authors

  • P. Zhang
  • E. A. Schon
  • S. G. Fischer
  • E. Cayanis
  • J. Weiss
  • S. Kistler
  • Philip E. Bourne
Abstract

An algorithm is described for mapping DNA contigs based on an interval graph (IG) representation. In general terms, the input to the algorithm is a set of binary overlap relations among finite intervals spread along a real line, from which the algorithm generates sets of ordered overlapping fragments spanning that line. The implications of a more general case of the IG, called a probe interval graph (PIG), in which only a subset of cosmids is used as probes, are also discussed. In the specific case of cosmids hybridizing to regions of a YAC, the algorithm takes cross-hybridization information obtained using the cosmids as probes and orders the cosmids along the YAC; if gaps exist due to insufficient coverage of cosmid contigs along the length of the YAC, repeated use of the algorithm generates sets of ordered overlapping fragments. Both the IG and the PIG can expose problems caused by false overlaps, such as hybridizations due to repetitive elements. The algorithm has been coded in C; CPU time is essentially linear with respect to the number of cosmids analyzed. Results are presented for the application of a PIG to cosmid contig assembly along a human chromosome 13-specific YAC. An alignment of 67 cosmids spanning a YAC took 0.28 seconds of CPU time on a Convex 220 computer.
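
As a rough illustration of the input described above, the following Python sketch takes a set of pairwise overlap relations and groups the clones into contigs by finding connected components of the overlap graph. It is not the authors' algorithm (which was written in C and also orders the clones within each contig); the function name `group_into_contigs` and the cosmid labels are hypothetical, and the sketch shows only how coverage gaps split the clones into separate sets.

```python
from collections import defaultdict


def group_into_contigs(overlaps):
    """Group clones into contigs, assuming a contig is a connected
    component of the graph whose edges are the observed overlaps.

    `overlaps` is an iterable of (clone_a, clone_b) pairs.
    Returns a list of contigs; each contig is an alphabetically sorted
    list of clone names (NOT their order along the map).
    """
    # Build an undirected adjacency structure from the binary relations.
    adjacency = defaultdict(set)
    for a, b in overlaps:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    contigs = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Depth-first traversal collects every clone reachable from `start`.
        seen.add(start)
        stack = [start]
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        contigs.append(sorted(component))
    return contigs


# Hypothetical hybridization results: c1, c2, c3 overlap in a chain,
# while c4 and c5 form a second island separated by a coverage gap.
example_overlaps = [("c1", "c2"), ("c2", "c3"), ("c4", "c5")]
contigs = group_into_contigs(example_overlaps)
print(contigs)
```

Each edge is visited a constant number of times, so the grouping step runs in time linear in the number of clones plus overlaps. Ordering the clones inside a contig, and detecting false overlaps that violate the interval graph structure, would need additional logic that this sketch does not attempt.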

Similar resources

Clustering of Short Read Sequences for de novo Transcriptome Assembly

Given the importance of transcriptome analysis in various biological studies and considering the vast amount of whole transcriptome sequencing data, it seems necessary to develop an algorithm to assemble transcriptome data. In this study we propose an algorithm for transcriptome assembly in the absence of a reference genome. First, the contiguous sequences are generated using de Bruijn graph with d...

Sequencing Mixed Model Assembly Line Problem to Minimize Line Stoppages Cost by a Modified Simulated Annealing Algorithm Based on Cloud Theory

This research presents a new application of the cloud theory-based simulated annealing algorithm to solve mixed model assembly line sequencing problems where line stoppage cost is expected to be optimized. This objective is highly significant in mixed model assembly line sequencing problems based on just-in-time production system. Moreover, this type of problem is NP-hard and solving this probl...

An algorithm for integrated worker assignment, mixed-model two-sided assembly line balancing and bottleneck analysis

This paper addresses multi-objective mixed-model two-sided assembly line balancing and worker assignment with bottleneck analysis when the task times depend on the worker's skill. This problem is known to be NP-hard; thus, a hybrid cyclic-hierarchical algorithm is presented for solving it. The algorithm is based on Particle Swarm Optimization (PSO) and Theory of Constraints (TOC) an...

A multi agent method for cell formation with uncertain situation, based on information theory

This paper assumes the cell formation problem as a distributed decision network. It proposes an approach based on application and extension of information theory concepts, in order to analyze informational complexity in an agent-based system, due to interdependence between agents. Based on this approach, new quantitative concepts and definitions are proposed in order to measure the amount of t...

A New Method for Characterization of Biological Particles in Microscopic Videos: Hypothesis Testing Based on a Combination of Stochastic Modeling and Graph Theory

Introduction: Studying the motility of biological objects is important in many biomedical processes. Therefore, automated methods for analyzing microscopic videos are becoming an important step in recent research. Materials and Methods: In the proposed method of this article, a hypothesis testing function is defined to separate biological particles from artifact and noise in captured v...

Journal title:
  • Computer applications in the biosciences : CABIOS

Volume 10, Issue 3

Pages -

Publication date: 1994